Abstract

We consider a multi-cell multiple-input multiple-output (MIMO) coordinated downlink transmission, also known as network 
MIMO, under per-antenna power constraints. We investigate a simple multiuser zero-forcing (ZF) linear precoding technique known 
as block diagonalization (BD) for network MIMO. The optimal form of BD with per-antenna power constraints is proposed. It 
involves a novel approach of optimizing the precoding matrices over the entire null space of other users' transmissions. An iterative 
gradient descent method is derived by solving the dual of the throughput maximization problem, which finds the optimal precoding 
matrices globally and efficiently. Comprehensive simulations illustrate several network MIMO coordination advantages when
the optimal BD scheme is used. Its achievable throughput is compared with the capacity region obtained through the recently
established duality concept under per-antenna power constraints.

Index Terms

Base station cooperation, Block diagonalization, Multiuser zero-forcing, Network MIMO, Per-antenna power constraints, 

I. Introduction

While the potential capacity gains in point-to-point [1], [2] and multiuser [3] multiple-input multiple-output (MIMO) wireless
systems are significant, in cellular networks this increase is very limited due to intra-cell and inter-cell interference. Indeed,
the capacity gains promised by MIMO are severely degraded in cellular environments [4], [5]. To mitigate this limitation and
achieve the spectral efficiency increase due to MIMO spatial multiplexing in future broadband cellular systems, network-level
interference management is necessary. Consequently, there has been growing interest in network MIMO coordination [6]-[11].
Network MIMO coordination is a very promising approach to increasing the signal to interference plus noise ratio (SINR)
on the downlinks of cellular networks without reducing the frequency reuse factor or traffic load. It is based on cooperative
transmission by base stations in multiuser, multi-cell MIMO systems. Network MIMO coordinated transmission is often
analyzed using a large virtual MIMO broadcast channel (BC) model with one base station and more antennas [12]-[14]. This
approach increases the number of transmit antennas available to each user, and hence the capacity increases dramatically
compared to conventional MIMO networks without coordination [7]-[9]. Moreover, inter-cell scheduled transmission benefits
from the increased multiuser diversity gain [15]. The capacity region of network MIMO coordination as a MIMO BC has been
previously established under a sum power constraint [16]-[20] using uplink-downlink duality. However, coordination between
multiple base stations requires per-base-station or, more realistically in practice, per-antenna power constraints, ideally
extendable to any linear power constraints. Under per-antenna power constraints, uplink-downlink duality for the multi-antenna
downlink channel has been established in [21], [22] using the Lagrangian duality framework of convex optimization [23] to
explore the capacity region. It is known that the capacity region is achievable with dirty paper coding (DPC). However, DPC is
too complex for practical implementations. Consequently, due to their simplicity, linear precoding schemes such as multiuser
zero-forcing or block diagonalization (BD) are considered [24], [25].

The key idea of BD is linear precoding of data in such a way that transmission for each user lies within the null space of 
other users' transmissions. Therefore, the interference to other users is eliminated. Multi-cell BD has been employed explicitly 
for network MIMO coordinated systems in [26]-[29] with the diagonal structure of the precoders and the sum power constraint
of [24]. Although these papers attempt to optimize the precoders to satisfy per-base-station and per-antenna power
constraints, this structure of the precoders is no longer optimal for such power constraints and must be revised [27], [30], [31].
In [32], the ZF matrix is confined to the pseudo-inverse of the channel for single-receive-antenna users with per-antenna
power constraints. The sub-optimality of pseudo-inverse ZF beamforming subject to per-antenna power constraints was first
shown in [27]. The optimal precoder structure was presented in [30] using the concept of generalized inverses, which leads to
a non-convex optimization problem whose relaxed form requires semi-definite programming (SDP) [33]. This is investigated
only for single-antenna mobile users. The work in [31] also uses generalized inverses for single-antenna mobile users, but with
a multistage optimization algorithm.

In this work, we aim to maximize the throughput of network MIMO coordination employing multiple antennas both at the 
base stations and the mobile users through optimization of precoding. An optimal form of BD is introduced by extending 
the search domain of precoding matrices to the entire null space of other users' transmissions [34]. The dual of the throughput
maximization problem is utilized to obtain a simple iterative gradient descent method [23] that finds the optimal linear precoding
matrices efficiently and globally. The gradient descent method applied to the dual problem requires fewer optimization variables
and less computation than comparable algorithms proposed in [26], [28], [30], [31]. The work in [35] employs the idea
presented in [34] of optimizing over the entire null space of other users' channels, but obtains a sub-gradient algorithm.
Unlike the gradient method, the sub-gradient method is not a descent method and does not use a line search for the
step sizes [36]. Furthermore, our approach is also extended to the case of non-square channel matrices, single-antenna mobile
users and per-base-station power constraints. In contrast to previous numerical results on network MIMO coordination [26],
[37], [38], which assume sum power or per-base-station power constraints, in this paper the proposed optimal BD is examined
with per-antenna power constraints enforced. To consider feasible network MIMO coordination in practice, local coordination
of base stations is used through clustering [26], [38], [39]. The results show that the proposed optimal BD scheme outperforms
the earlier BD schemes used in network MIMO coordination. For the sake of comparison, the capacity limits are determined
employing the uplink-downlink duality idea in MIMO BC under per-antenna power constraints introduced in [21], [22].

The remainder of this paper is organized as follows. In Section II the system model is introduced, and the network MIMO
coordination structure, the transmission strategy and the corresponding capacity region are discussed. In Section III the
multi-cell BD scheme is studied and its comparison with the conventional BD is presented, which motivates research on optimal
multi-cell BD under per-antenna power constraints. The optimal multi-cell BD scheme is proposed in Section III-B and its
further extensions and generalizations are considered. Comprehensive numerical results are presented in Section V following
the discussion of the simulation setup in Section IV. Conclusions are given in Section VI.

II. System Model 

A. Network MIMO Coordinated Structure 

We consider a downlink cellular MIMO network, with multiple antennas at both base stations and mobile users. Each user 
is equipped with $n_r$ receive antennas and each base station is equipped with $n_t$ transmit antennas. The base stations across
the network are assumed to be coordinated via high-speed backhaul links. For a large cellular network of several cells, this
coordination is difficult in practice and requires a large amount of channel state information and user data to be available at
each base station. Hence, clustering of the network is applied, where each group of $B$ cells is clustered together and benefits
from intra-cluster coordinated transmission [26], [38], [39]. Within each cluster, each user's receive antennas may therefore
receive signals from all $N_t = n_t B$ transmit antennas. The cellular network contains $C$ clusters. The base stations within
each cluster are connected and capable of cooperatively transmitting data to mobile users within the cluster. Hence, there are
two types of interference in the network: intra-cluster and inter-cluster interference. If we define
$\mathbf{H}_{c,k,b} \in \mathbb{C}^{n_r \times n_t}$ to be the downlink channel matrix of user $k$ from base station $b$ within
cluster $c$, then the aggregate downlink channel matrix of user $k$ within cluster $c$ is an $n_r \times N_t$ matrix defined as
$\mathbf{H}_{c,k} = [\mathbf{H}_{c,k,1}\ \mathbf{H}_{c,k,2} \cdots \mathbf{H}_{c,k,B}]$. The aggregate downlink channel matrix
for all $K$ users scheduled within cluster $c$, $\mathbf{H}_c \in \mathbb{C}^{K n_r \times N_t}$, is defined as
$\mathbf{H}_c = [\mathbf{H}_{c,1}^T, \ldots, \mathbf{H}_{c,K}^T]^T$, where $(\cdot)^T$ denotes the matrix transpose.
The multiuser downlink channel is also called the broadcast channel (BC) in the information theory literature [40]. Assuming
that the same channel is used on the uplink and downlink, the aggregate uplink channel matrix is $\mathbf{H}_c^H$, where
$(\cdot)^H$ denotes the conjugate (Hermitian) matrix transpose [13]. The multiuser uplink channel is also called the
multiple-access channel (MAC). In the BC, let $\mathbf{x}_c \in \mathbb{C}^{N_t \times 1}$ denote the transmitted signal vector
(from the $N_t$ base station antennas of the $c$th cluster) and let $\mathbf{y}_{c,k} \in \mathbb{C}^{n_r \times 1}$ be the
received signal at the receiver of mobile user $k$. The noise at receiver $k$ is represented by
$\mathbf{n}_{c,k} \in \mathbb{C}^{n_r \times 1}$, containing $n_r$ circularly symmetric complex Gaussian components
($\mathbf{n}_{c,k} \sim \mathcal{CN}(\mathbf{0}, \sigma^2 \mathbf{I})$). The received signal at the $k$th user in cluster $c$ is
then

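$$\mathbf{y}_{c,k} = \underbrace{\mathbf{H}_{c,k}\mathbf{x}_c}_{\text{Intra-cluster signal}} + \underbrace{\sum_{\hat{c}=1,\ \hat{c} \neq c}^{C} \hat{\mathbf{H}}_{\hat{c},k}\mathbf{x}_{\hat{c}}}_{\text{Inter-cluster interference}} + \mathbf{n}_{c,k} \tag{1}$$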

where $\hat{\mathbf{H}}_{\hat{c},k}$ represents the channel coefficients from the surrounding cluster $\hat{c}$ to the $k$th user of
cluster $c$. The transmit covariance matrix can be defined as $\mathbf{S}_{c,x} = \mathrm{E}[\mathbf{x}_c \mathbf{x}_c^H]$. The base
stations are subject to the per-antenna power constraints $p_1, \ldots, p_{N_t}$, which imply

$$[\mathbf{S}_{c,x}]_{i,i} \le p_i, \quad i = 1, \ldots, N_t \tag{2}$$

where $[\cdot]_{i,i}$ is the $i$th diagonal element of a matrix.

The cancelation of intra-cluster multiuser interference is done by applying BD, which is discussed in Section III. The
remaining inter-cluster interference plus noise covariance matrix at the $k$th user of cluster $c$ is given by
Fig. 1. Block diagram of network MIMO coordination transmission strategy.



where $\mathrm{E}[\mathbf{x}_{\hat{c}}\mathbf{x}_{\hat{c}}^H] = \mathbf{S}_{\hat{c},x}$.

To simplify the analysis, we have normalized the vectors in (1) by dividing each by the standard deviation of the additive noise
component, $\sigma$. Completely removing the inter-cluster interference requires universal coordination between all surrounding
clusters. The worst-case scenario for interference is when all surrounding clusters transmit at full allowed power [41, Theorem
1]. Although this result is for the case with a total sum power constraint on the transmit antennas, it is used in our numerical
results and gives a pessimistic estimate of the performance of network MIMO coordination [38]. A pre-whitening filter can then
be applied to the system, and as a result the inter-cluster interference in this case can be assumed spatially white [42]. The
received signal for the $k$th user in the $c$th cluster after post-processing can be simplified as



$$\mathbf{y}_k = \mathbf{H}_k \mathbf{x} + \mathbf{z}_k, \quad k = 1, \ldots, K \tag{4}$$

where $\mathbf{z}_k$ is the noise vector. For ease of notation, we dropped the cluster index $c$.



B. Capacity Region for Network MIMO Coordination 

The capacity region of a MIMO BC with a sum power constraint has been previously discussed in [16]-[18]. The sum capacity
of a Gaussian vector broadcast channel under per-antenna power constraints is the saddle point of a minimax problem and it is
shown to be equivalent to a dual MAC with linearly constrained noise [22]. The dual minimax problem is convex-concave and
consequently the original downlink optimization problem can be solved globally in the dual domain. An efficient algorithm
using Newton's method [23] is used in [22] to solve the dual minimax problem; it finds an efficient search direction for the
simultaneous maximization and minimization. This capacity result is used to determine the sum capacity of the multi-base
coordinated network and it constitutes the performance limit for the proposed transmission schemes.

C. Transmission Strategy 

A block diagram of the transmission strategy for network MIMO coordination is shown in Fig. 1. The transmitted symbol to
user $k$ is an $n_r$-dimensional vector $\mathbf{u}_k$, which is multiplied by an $N_t \times n_r$ precoding matrix
$\mathbf{W}_k$ and passed on to the base stations' antenna array. Since all base station antennas are coordinated, the complex
antenna output vector $\mathbf{x}$ is composed of signals for all $K$ users. Therefore, $\mathbf{x}$ can be written as follows

$$\mathbf{x} = \sum_{k=1}^{K} \mathbf{W}_k \mathbf{u}_k \tag{5}$$

where $\mathrm{E}[\mathbf{u}_k \mathbf{u}_k^H] = \mathbf{I}_{n_r}$. The received signal $\mathbf{y}_k$ at user $k$ can be represented as

$$\mathbf{y}_k = \mathbf{H}_k \mathbf{W}_k \mathbf{u}_k + \sum_{j \neq k} \mathbf{H}_k \mathbf{W}_j \mathbf{u}_j + \mathbf{z}_k, \tag{6}$$

where $\mathbf{z}_k \sim \mathcal{CN}(\mathbf{0}, \mathbf{I}_{n_r})$ denotes the normalized AWGN vector at user $k$. The random
characteristics of the channel matrix entries of $\mathbf{H}_k$ are discussed in Section IV. They encompass three factors: path
loss, Rayleigh fading, and log-normal shadowing. The random structure of the channel coefficients ensures
$\mathrm{rank}(\mathbf{H}_k) = \min(n_r, N_t) = n_r$ for user $k$ with probability one. The per-antenna power
constraints (2) impose a power constraint

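$$[\mathbf{S}_x]_{i,i} = \left[\mathrm{E}[\mathbf{x}\mathbf{x}^H]\right]_{i,i} \le p_i, \quad i = 1, \ldots, N_t \tag{7}$$

on each transmit antenna. The sum power constraint can also be expressed as

$$\mathrm{tr}\{\mathbf{S}_x\} = \sum_{k=1}^{K} \mathrm{tr}\{\mathbf{W}_k \mathbf{W}_k^H\} \le P. \tag{8}$$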
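
As a small illustration of the difference between the per-antenna constraints (7) and the sum power constraint (8), the following sketch evaluates both for a set of precoders. It assumes unit-covariance data symbols, so that the transmit covariance is the sum of $\mathbf{W}_k \mathbf{W}_k^H$; the function names and the toy precoders are hypothetical and chosen only for illustration.

```python
import numpy as np

def per_antenna_powers(precoders):
    """Diagonal of S_x = sum_k W_k W_k^H, assuming E[u_k u_k^H] = I."""
    S_x = sum(W @ W.conj().T for W in precoders)
    return np.real(np.diag(S_x))

def satisfies_per_antenna(precoders, p, tol=1e-9):
    """Check [S_x]_ii <= p_i for every transmit antenna, as in (7)."""
    return bool(np.all(per_antenna_powers(precoders) <= np.asarray(p) + tol))

def satisfies_sum_power(precoders, P, tol=1e-9):
    """Check tr{S_x} <= P, as in (8)."""
    return bool(per_antenna_powers(precoders).sum() <= P + tol)

# Toy example (illustrative values): N_t = 4 antennas, K = 2 users, n_r = 2.
W1 = np.array([[1, 0], [0, 1], [0, 0], [0, 0]], dtype=complex)
W2 = np.array([[1, 0], [0, 0], [0, 0], [0, 1]], dtype=complex)
powers = per_antenna_powers([W1, W2])

print(powers)                                          # power radiated per antenna
print(satisfies_sum_power([W1, W2], P=4.0))            # total budget of 4 is met
print(satisfies_per_antenna([W1, W2], p=[1, 1, 1, 1])) # antenna 0 exceeds its limit
```

In this toy case the total power equals the sum budget while the first antenna radiates twice its individual limit, which is the situation a sum-power design cannot rule out.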

Due to the structure of the multiuser zero-forcing scheme, the number of users that can be served simultaneously in each time
slot is limited. Hence, a user selection algorithm is necessary. We consider two main criteria for the user selection scheme:
maximum sum rate (MSR) and proportional fairness. We employ the greedy user selection algorithm discussed in [43], [44].
The proportionally-fair user selection algorithm is based on the greedy weighted user selection algorithm with an update of the
weights discussed in [45]-[47].
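
The weight update used for proportional fairness is not spelled out here, so the following is only a sketch of one common form: each user's average rate is tracked with an exponential window of length `tau` slots, and the scheduling weight is the inverse of that average. The function names, the `floor` guard and the exact update rule are assumptions for illustration, not the algorithm of [45]-[47].

```python
def update_average_rates(avg_rates, served_rates, tau=10):
    """Exponentially weighted average rate per user over roughly tau slots.

    served_rates[k] is the rate user k received in the current slot
    (0.0 if the user was not scheduled).
    """
    return [(1 - 1 / tau) * avg + (1 / tau) * r
            for avg, r in zip(avg_rates, served_rates)]

def pf_weights(avg_rates, floor=1e-6):
    """Proportional-fair weights: users with low average rate get priority."""
    return [1.0 / max(avg, floor) for avg in avg_rates]

# One scheduling slot for two users: user 0 is skipped, user 1 is served.
avg = update_average_rates([1.0, 2.0], [0.0, 4.0], tau=10)
weights = pf_weights(avg)
print(avg, weights)
```

A greedy weighted selection would then pick, at each step, the user whose addition most increases the weighted sum rate under these weights.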

III. Multi-cell Multiuser Block Diagonalization 

To remove the intra-cluster interference, practical linear zero-forcing can be employed. Applying multiuser zero-forcing
to multiple-antenna users requires block diagonalization rather than channel inversion [24]. Assuming the transmission
strategy in Section II-C, each user's data $\mathbf{u}_k$ is precoded with the matrix $\mathbf{W}_k$, such that

$$\mathbf{H}_k \mathbf{W}_j = \mathbf{0} \quad \text{for all } k \neq j \text{ and } 1 \le k, j \le K. \tag{9}$$

Hence the received signal for user $k$ can be simplified to

$$\mathbf{y}_k = \mathbf{H}_k \mathbf{W}_k \mathbf{u}_k + \mathbf{n}_k. \tag{10}$$

Let $\tilde{\mathbf{H}}_k = [\mathbf{H}_1^T \cdots \mathbf{H}_{k-1}^T\ \mathbf{H}_{k+1}^T \cdots \mathbf{H}_K^T]^T$. The
zero-interference constraint in (9) forces $\mathbf{W}_k$ to lie in the null space of $\tilde{\mathbf{H}}_k$, which requires the
dimension condition $B n_t \ge K n_r$ to be satisfied. This follows directly from the definition of the null space in linear
algebra [48]. Hence, the maximum number of users that can be served in a time slot is $K = \lfloor B n_t / n_r \rfloor$. We focus
on the $K$ users which are selected through a scheduling algorithm and assigned to one subband. The remaining unserved users
are referred to other subbands or will be scheduled in other time slots. Recall that the vectors in (1) are normalized with
respect to the standard deviation of the additive noise component, $\sigma$, resulting in $\mathbf{n}_k$ having components with
unit variance. Assume that $\tilde{\mathbf{H}}_k$ is a full-rank matrix with $\mathrm{rank}(\tilde{\mathbf{H}}_k) = (K-1) n_r$,
which holds with probability one due to the randomness of the entries of the channel matrices. We perform the singular value
decomposition (SVD)

$$\tilde{\mathbf{H}}_k = \tilde{\mathbf{U}}_k \tilde{\boldsymbol{\Lambda}}_k [\tilde{\boldsymbol{\Upsilon}}_k\ \tilde{\mathbf{V}}_k]^H \tag{11}$$

where $\tilde{\boldsymbol{\Upsilon}}_k$ holds the first $(K-1) n_r$ right singular vectors corresponding to non-zero singular
values, and $\tilde{\mathbf{V}}_k \in \mathbb{C}^{N_t \times m_r}$ contains the last $m_r = N_t - (K-1) n_r$ right singular
vectors corresponding to zero singular values of $\tilde{\mathbf{H}}_k$. If the number of scheduled users is $K = N_t / n_r$
then $m_r = n_r$; otherwise $m_r > n_r$ when $K < N_t / n_r$. The orthonormality of $\tilde{\mathbf{V}}_k$ means that
$\tilde{\mathbf{V}}_k^H \tilde{\mathbf{V}}_k = \mathbf{I}_{m_r}$. The columns of $\tilde{\mathbf{V}}_k$ form a basis of the null
space of $\tilde{\mathbf{H}}_k$, and hence $\mathbf{W}_k$ can be any linear combination of the columns of
$\tilde{\mathbf{V}}_k$, i.e.

$$\mathbf{W}_k = \tilde{\mathbf{V}}_k \boldsymbol{\Psi}_k, \quad k = 1, \ldots, K \tag{12}$$

where $\boldsymbol{\Psi}_k \in \mathbb{C}^{m_r \times n_r}$ can be any arbitrary matrix subject to the per-antenna power
constraints [34]. The conventional BD scheme proposed in [24] assumes only linear combinations of a diagonal form, which
simplifies the problem to a power allocation algorithm through water-filling. The conventional BD is optimal only when a sum
power constraint is applied [49], and it is not optimal under per-antenna power constraints [27], [30], [31].
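
The null-space construction can be checked numerically. The sketch below is illustrative only: it draws random complex Gaussian channels (a stand-in for the propagation model of Section IV), builds an orthonormal null-space basis for each user from the SVD of the other users' stacked channels, and confirms that any precoder formed from that basis causes no interference at the other users. The function names and dimensions are hypothetical, and at least two users are assumed.

```python
import numpy as np

def null_space_basis(H_others, tol=1e-10):
    """Orthonormal basis (as columns) of the null space of H_others via the SVD."""
    _, s, Vh = np.linalg.svd(H_others, full_matrices=True)
    rank = int(np.sum(s > tol * s[0]))
    return Vh[rank:].conj().T            # shape: N_t x (N_t - rank)

def bd_null_bases(H_list):
    """For each user k, a basis of the null space of all other users' channels."""
    bases = []
    for k in range(len(H_list)):
        H_tilde = np.vstack([H for j, H in enumerate(H_list) if j != k])
        bases.append(null_space_basis(H_tilde))
    return bases

rng = np.random.default_rng(0)
K, n_r, N_t = 3, 2, 8                    # so m_r = N_t - (K - 1) * n_r = 4
H_list = [(rng.standard_normal((n_r, N_t))
           + 1j * rng.standard_normal((n_r, N_t))) / np.sqrt(2) for _ in range(K)]
V = bd_null_bases(H_list)

# Any m_r x n_r combining matrix keeps the precoder inside the null space.
W = [V[k] @ rng.standard_normal((V[k].shape[1], n_r)) for k in range(K)]
leak = max(np.linalg.norm(H_list[j] @ W[k])
           for k in range(K) for j in range(K) if j != k)
print(V[0].shape, leak)                  # leakage is at rounding-error level
```

Because the zero-interference property holds for every choice of the combining matrix, the remaining design freedom lies entirely in that matrix, which is what the optimization below exploits.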

A. Conventional BD 

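In conventional BD [24], the sum power constraint is applied to the throughput maximization problem, which is then reduced
to a simple water-filling power allocation algorithm. In this scheme, the linear combination introduced in (12) is confined to
have the form

$$\boldsymbol{\Psi}_k = \bar{\mathbf{V}}_k \boldsymbol{\Theta}_k^{1/2}, \quad k = 1, \ldots, K \tag{13}$$

where $\bar{\mathbf{V}}_k \in \mathbb{C}^{m_r \times n_r}$ are the right singular vectors of $\mathbf{H}_k \tilde{\mathbf{V}}_k$
corresponding to its non-zero singular values. Hence, the aggregate precoding matrix of the conventional scheme,
$\mathbf{W}_{BD}$, is defined as

$$\mathbf{W}_{BD} = \left[\tilde{\mathbf{V}}_1 \bar{\mathbf{V}}_1\ \ \tilde{\mathbf{V}}_2 \bar{\mathbf{V}}_2\ \cdots\ \tilde{\mathbf{V}}_K \bar{\mathbf{V}}_K\right] \boldsymbol{\Theta}^{1/2} \tag{14}$$

where $\boldsymbol{\Theta} = \mathrm{bdiag}[\boldsymbol{\Theta}_1, \ldots, \boldsymbol{\Theta}_K]$ is a diagonal matrix whose
elements scale the power transmitted into each of the columns of $\mathbf{W}_{BD}$. The sum power constraint implies that

$$\sum_{k=1}^{K} \mathrm{tr}\{\tilde{\mathbf{V}}_k \bar{\mathbf{V}}_k \boldsymbol{\Theta}_k \bar{\mathbf{V}}_k^H \tilde{\mathbf{V}}_k^H\} = \sum_{k=1}^{K} \mathrm{tr}\{\boldsymbol{\Theta}_k\} \le P. \tag{15}$$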
Fig. 2. Comparison of sum rates for conventional BD vs. the proposed optimal BD for B = 1, Nt = 6, 12, n r = 2 using maximum sum rate scheduling.



This reduces the problem to an optimization over the diagonal terms of $\boldsymbol{\Theta}_k$, which can be interpreted as a
power allocation problem and solved through the well-known water-filling algorithm over the diagonal terms of
$\boldsymbol{\Theta}$. However, this form of BD cannot be extended as an optimal precoder to the case of per-antenna power
constraints because

Note that the $i$th diagonal term of the left side of (16) is a linear combination of the matrix entries and not only of the
diagonal terms. The selection of $\boldsymbol{\Theta}$ as a diagonal matrix reduces the search domain
of the optimization and hence does not lead to the optimal solution. Furthermore, $\bar{\mathbf{V}}_k$ impacts the diagonal terms
of $\mathbf{W}_{BD}\mathbf{W}_{BD}^H$ (i.e. the transmit covariance matrix), and therefore the insertion of $\bar{\mathbf{V}}_k$
does not necessarily reduce the power allocated to each antenna. In addition, it adds $K$ SVD operations to the precoding
computation procedure (one for each served user) to find $\bar{\mathbf{V}}_k$. Additionally, the per-antenna power constraints
do not allow the optimization to reduce to a simple water-filling algorithm. Previous work on BD with per-antenna (and similarly
per-base-station) power constraints for the case of multiple receive antennas employs this conventional BD and optimizes the
diagonal terms of $\boldsymbol{\Theta}$ [26]-[28]. Hence, it is not optimal. The optimal form of BD proposed in this paper
includes the optimization over the entire null space of other users' channel matrices, resulting in optimal precoders
under per-antenna power constraints, easily extendable to per-base-station power constraints.
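
For reference, the water-filling allocation that conventional BD relies on under a sum power constraint can be sketched as follows. This is the textbook water-filling solution, not code from [24]; the function name is hypothetical, and the `gains` are assumed to be the strictly positive squared singular values of the effective channels, with noise normalized to unit variance.

```python
import numpy as np

def water_filling(gains, total_power):
    """Maximize sum_i log(1 + g_i * p_i) subject to sum_i p_i <= P and p_i >= 0.

    Returns the power allocation and the water level mu, with
    p_i = max(0, mu - 1 / g_i).
    """
    gains = np.asarray(gains, dtype=float)
    order = np.argsort(gains)[::-1]              # strongest eigenmode first
    inv = 1.0 / gains[order]
    for n in range(len(gains), 0, -1):           # try using the n strongest modes
        mu = (total_power + inv[:n].sum()) / n   # candidate water level
        if mu > inv[n - 1]:                      # weakest active mode gets power > 0
            break
    p = np.zeros_like(gains)
    p[order[:n]] = mu - inv[:n]
    return p, mu

p, mu = water_filling([2.0, 1.0, 0.1], total_power=1.0)
print(p, mu)   # the weakest mode receives no power
```

The allocation depends on a single water level, which is why it cannot enforce a separate limit on each antenna.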

The numerical results in Fig. 2 compare the maximized sum rate of a MIMO BC system with conventional BD [24] and the
optimal scheme proposed later in this paper. There are 12 transmit antennas at the base station and 2 receive antennas at each
mobile user. $B = 1$ is considered to specifically show the difference between the two BD schemes. Note that the conventional
BD optimizes only over the $K n_r$ non-negative diagonal entries of $\boldsymbol{\Theta}$, while the optimal BD searches over
all $K$ positive semi-definite matrices $\boldsymbol{\Phi}_k$ and therefore has a larger domain, which grows when the number of
users per cell increases. As a consequence, the difference between
these two schemes increases with the number of users per cell. Details of the simulation setup are given in Section IV. In
the following section the optimal BD scheme is introduced and discussed in detail, and the algorithm to find the precoders is
presented.



B. Optimal Multi-Cell BD 

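The focus of this section is on the design of optimal multi-cell BD precoder matrices $\mathbf{W}_k$ to maximize the throughput
while enforcing per-antenna power constraints. In this scheme, we search over the entire null space of other users' channel
matrices ($\tilde{\mathbf{H}}_k$), i.e. $\boldsymbol{\Psi}_k$ can be any arbitrary matrix in $\mathbb{C}^{m_r \times n_r}$
satisfying the per-antenna power constraints.

Following the design of precoders according to (12), the received signal for user $k$ can be expressed as

$$\mathbf{y}_k = \mathbf{H}_k \tilde{\mathbf{V}}_k \boldsymbol{\Psi}_k \mathbf{u}_k + \mathbf{z}_k. \tag{17}$$

Denote $\boldsymbol{\Phi}_k = \boldsymbol{\Psi}_k \boldsymbol{\Psi}_k^H \in \mathbb{C}^{m_r \times m_r}$, $k = 1, \ldots, K$, which
are positive semi-definite matrices. The rate of the $k$th user is given by

$$R_k = \log \left| \mathbf{I} + \mathbf{H}_k \tilde{\mathbf{V}}_k \boldsymbol{\Phi}_k \tilde{\mathbf{V}}_k^H \mathbf{H}_k^H \right|. \tag{18}$$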

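Therefore, the sum rate maximization problem can be expressed as

$$\begin{aligned}
&\text{maximize} && \sum_{k=1}^{K} \log \left| \mathbf{I} + \mathbf{H}_k \tilde{\mathbf{V}}_k \boldsymbol{\Phi}_k \tilde{\mathbf{V}}_k^H \mathbf{H}_k^H \right| \\
&\text{subject to} && \left[ \sum_{k=1}^{K} \tilde{\mathbf{V}}_k \boldsymbol{\Phi}_k \tilde{\mathbf{V}}_k^H \right]_{i,i} \le p_i, \quad i = 1, \ldots, N_t \\
& && \boldsymbol{\Phi}_k \succeq 0, \quad k = 1, \ldots, K,
\end{aligned} \tag{19}$$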

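where the maximization is over all positive semi-definite matrices $\boldsymbol{\Phi}_1, \ldots, \boldsymbol{\Phi}_K$ with a rank
constraint of $\mathrm{rank}(\boldsymbol{\Phi}_k) \le n_r$. Notice that the objective function in (19) is concave [48, p. 466] and
the constraints are affine functions [23]. Thus, the problem is a convex optimization problem. We propose a gradient descent
algorithm to find the optimal BD precoders. We define $\mathbf{G}_k = \mathbf{H}_k \tilde{\mathbf{V}}_k$ and correspondingly its
right pseudo-inverse matrix as $\mathbf{G}_k^{\dagger} = \mathbf{G}_k^H (\mathbf{G}_k \mathbf{G}_k^H)^{-1}$. Let
$\mathbf{Q}_k = \tilde{\mathbf{V}}_k \mathbf{G}_k^{\dagger}$, which is an $N_t \times n_r$ matrix, and perform the SVD
$\mathbf{Q}_k^H \boldsymbol{\Lambda} \mathbf{Q}_k = \mathbf{U}_k \boldsymbol{\Sigma}_k \mathbf{U}_k^H$, with
$\boldsymbol{\Lambda}$ the diagonal dual variable of Theorem 1. We introduce the positive
semi-definite matrices $\boldsymbol{\Omega}_k$ defined as

$$\boldsymbol{\Omega}_k = \mathbf{U}_k \left[ \boldsymbol{\Sigma}_k^{-1} - \mathbf{I} \right]_{+} \mathbf{U}_k^H, \tag{20}$$

where the operator $[\mathbf{D}]_{+} = \mathrm{diag}[\max(0, d_1), \ldots, \max(0, d_n)]$ acts on a diagonal matrix
$\mathbf{D} = \mathrm{diag}[d_1, \ldots, d_n]$.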
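Theorem 1: The optimal BD precoders can be obtained through solving the dual problem

$$\begin{aligned}
&\text{minimize} && g(\boldsymbol{\Lambda}) \\
&\text{subject to} && \boldsymbol{\Lambda} \succeq 0, \quad \boldsymbol{\Lambda} \text{ diagonal}
\end{aligned} \tag{21}$$

where

$$g(\boldsymbol{\Lambda}) = \ldots \tag{22}$$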



with a gradient descent direction given as

$$\Delta\boldsymbol{\Lambda} = \ldots$$

The optimal BD precoders for the optimal value $\boldsymbol{\Lambda}^*$ are given as

$$\mathbf{W}_k = \tilde{\mathbf{V}}_k \ldots$$

Proof: The proof is given in the Appendix.
The KKT conditions for the dual problem are given as

$$\boldsymbol{\Lambda} \succeq 0, \qquad \nabla_{\boldsymbol{\Lambda}}\, g \succeq 0, \qquad \ldots \tag{25}$$

with the last condition being the complementarity [23, p. 142]. Thus, the stopping criterion for the gradient descent method
can be established using small values of $\epsilon > 0$ replacing zero values.
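
The following is a minimal numerical sketch of a dual gradient descent of this kind, written under several assumptions that should be kept in mind. The dual function, its gradient and the precoder recovery used here are re-derived from the Lagrangian of (19) with $\mathbf{Q}_k$ and $\boldsymbol{\Omega}_k$ as defined above; they are not copied from Theorem 1 and may differ from it in form. The sketch covers only the square case $K n_r = N_t$, uses a simple projected step with a backtracking line search whose constants are arbitrary, measures rates in nats, and uses hypothetical function names throughout.

```python
import numpy as np

def bd_directions(H_list, tol=1e-10):
    """Q_k = V_k G_k^+, with V_k a null-space basis of the other users' channels."""
    Q_list = []
    for k, H_k in enumerate(H_list):
        H_tilde = np.vstack([H for j, H in enumerate(H_list) if j != k])
        _, s, Vh = np.linalg.svd(H_tilde)
        V_k = Vh[int(np.sum(s > tol * s[0])):].conj().T
        Q_list.append(V_k @ np.linalg.pinv(H_k @ V_k))
    return Q_list

def dual_terms(lam, Q_list, p):
    """Assumed dual value g(lam), its gradient p - diag(sum_k W_k W_k^H), and W_k."""
    g, used, W_list = float(lam @ p), np.zeros(len(p)), []
    for Q in Q_list:
        M = Q.conj().T @ (lam[:, None] * Q)            # Q^H Lambda Q
        sig, U = np.linalg.eigh(M)
        sig = np.maximum(sig, 1e-15)                   # numerical guard
        w = np.maximum(0.0, 1.0 / sig - 1.0)           # [Sigma^{-1} - I]_+
        g += float(np.sum(np.log1p(w)) - np.sum(sig * w))
        W = Q @ ((U * np.sqrt(w)) @ U.conj().T)        # Q Omega^{1/2}
        used += np.sum(np.abs(W) ** 2, axis=1)         # diag(W W^H)
        W_list.append(W)
    return g, p - used, W_list

def optimal_bd(H_list, p, iters=500, tol=1e-6, floor=1e-9):
    p = np.asarray(p, dtype=float)
    Q_list = bd_directions(H_list)
    lam, history = np.ones_like(p), []
    for _ in range(iters):
        g, grad, _ = dual_terms(lam, Q_list, p)
        history.append(g)
        # KKT-style stopping rule: gradient >= 0 and complementarity, up to tol.
        if grad.min() > -tol and np.abs(lam * grad).max() < tol:
            break
        step = 1.0
        while step > 1e-12:                            # backtracking line search
            new = np.maximum(lam - step * grad, floor)
            if dual_terms(new, Q_list, p)[0] <= g - 1e-4 * grad @ (lam - new):
                break
            step *= 0.5
        lam = new
    return lam, dual_terms(lam, Q_list, p)[2], history

# Toy case: two single-antenna users on decoupled unit channels, p = (1, 3).
# Here the problem separates and one expects lambda_k = 1 / (1 + p_k).
H_toy = [np.array([[1.0, 0.0]]), np.array([[0.0, 1.0]])]
lam_toy, W_toy, hist_toy = optimal_bd(H_toy, [1.0, 3.0])
print(lam_toy, [np.round(np.abs(W).ravel(), 3) for W in W_toy])
```

In the toy case each antenna serves exactly one user, so the optimum simply spends the full power of each antenna; in general the dual variables couple the users through the shared antennas.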

More interestingly, solving the sum rate maximization in (19) through the dual problem in (38) facilitates the extension to any
linear power constraints on the transmit antennas. The dual problem has $N_t$ variables $\lambda_i$, $i = 1, \ldots, N_t$, one for
each transmit antenna power constraint. More general power constraints than those given in (19) can be defined as [31]

$$\mathrm{tr}\left\{ \sum_{k=1}^{K} \tilde{\mathbf{V}}_k \boldsymbol{\Phi}_k \tilde{\mathbf{V}}_k^H \mathbf{T}_l \right\} \le p_l, \quad l = 1, \ldots, L \tag{26}$$

where $\mathbf{T}_l$ are positive semidefinite symmetric matrices and $p_l$ are non-negative values corresponding to each of the
$L$ linear constraints. Special cases of this structure of power constraints have been discussed frequently in the literature:
for $L = 1$, $p_1 = P$ and $\mathbf{T}_1 = \mathbf{I}$ the conventional sum power constraint results [24]; when $L = N_t$ and
$\mathbf{T}_l$ is a matrix with its $l$th diagonal term equal to one and all other elements zero, we get the per-antenna power
constraints studied in this section. Another scenario is the per-base-station power constraint, which is obtained with $L = B$,
$p_l = P_l$ (the $l$th per-base-station power limit) and $\mathbf{T}_l$ all zero except for ones on the $n_t$ terms of its
diagonal that correspond to the $l$th base station's transmit antennas. When the sum power constraint is applied, only one dual
variable is needed in the dual optimization problem (38) (i.e. $\boldsymbol{\Lambda} = \lambda \mathbf{I}_{N_t}$), where
$\lambda$ determines the water level in the water-filling algorithm [24]. For per-base-station power constraints, the dual
optimization variable can be defined as $\boldsymbol{\Lambda} = \boldsymbol{\Lambda}_{bs} \otimes \mathbf{I}_{n_t}$, where
$\boldsymbol{\Lambda}_{bs} = \mathrm{diag}[\lambda_1, \ldots, \lambda_B]$ consists of $B$ dual variables (one for each base
station) and the operator $\otimes$ is the Kronecker product [48]. The details of the optimization steps in the per-base-station
power constraints scenario are discussed in Section III-C and the study of general linear constraints is left for further work.
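
The three special cases of the linear constraints (26) differ only in the choice of the matrices $\mathbf{T}_l$. The sketch below builds them for a small example and evaluates the constrained quantities for a given transmit covariance; it also forms the per-base-station dual matrix with a Kronecker product. Function names and the numerical values are hypothetical, and antennas are assumed to be ordered base station by base station.

```python
import numpy as np

def sum_power_T(N_t):
    """L = 1: a single constraint with T_1 = I."""
    return [np.eye(N_t)]

def per_antenna_T(N_t):
    """L = N_t: T_l has a single one at its l-th diagonal entry."""
    return [np.diag(np.eye(N_t)[l]) for l in range(N_t)]

def per_base_station_T(B, n_t):
    """L = B: T_l selects the n_t antennas of base station l."""
    return [np.kron(np.diag(np.eye(B)[b]), np.eye(n_t)) for b in range(B)]

def constrained_powers(S_x, T_list):
    """Left-hand sides tr{S_x T_l} of the linear constraints in (26)."""
    return np.array([np.real(np.trace(S_x @ T)) for T in T_list])

def per_bs_dual_matrix(lam_bs, n_t):
    """Lambda = Lambda_bs (Kronecker product) I_{n_t}."""
    return np.kron(np.diag(lam_bs), np.eye(n_t))

B, n_t = 2, 2
S_x = np.diag([1.0, 2.0, 3.0, 4.0])          # illustrative transmit covariance
print(constrained_powers(S_x, sum_power_T(B * n_t)))        # total power
print(constrained_powers(S_x, per_antenna_T(B * n_t)))      # power per antenna
print(constrained_powers(S_x, per_base_station_T(B, n_t)))  # power per base station
print(np.diag(per_bs_dual_matrix([0.5, 2.0], n_t)))
```

Each base station shares one dual variable across its $n_t$ antennas, which is what reduces the number of dual variables from $N_t$ to $B$.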



C. Per-Base-Station Power Constraints 

In this section, the extension of the ZF beamforming optimization to a system with per-base-station power constraints is
considered. The optimization problem in (19) can be rewritten considering the per-base-station power constraints as

where $P_1, \ldots, P_B$ are the per-base-station maximum powers and $\mathbf{A}_b$ is a diagonal matrix with its entries equal
to one for the antennas within base station $b$ and the rest equal to zero. For simplicity, only the $b$th group of $n_t$
diagonal entries of $\mathbf{A}_b$ is equal to one. Following similar steps as (32), the Lagrange dual function is obtained as

where $\mathbf{P}_{bs} = \mathrm{diag}[P_1, \ldots, P_B]$ and $\otimes$ is the Kronecker product [48]. The KKT conditions yield that

where $\boldsymbol{\Lambda}_{bs} = \mathrm{diag}[\lambda_1, \ldots, \lambda_B]$ and $\boldsymbol{\Omega}_k$ can be defined in a
similar way as (35). The dual problem can be expressed similarly as (38). Following the steps in Section III-B, the gradient
descent search direction is given by

where $\mathrm{tr}_b$ is a partial matrix trace over the $b$th group of $n_t$ diagonal terms of a matrix, and
$\mathrm{diag}_{b=1,\ldots,B}[\cdot]$ gives a diagonal matrix with $B$ elements computed for each $b = 1, \ldots, B$.



D. Single Antenna Receivers 

Although this paper studies a network MIMO system with multiple receive antenna users, the results can be applied to a 
system with single receive antenna users. In this case each user's transmission must be orthogonal to a vector (rather than 
a matrix), which is the basis vector for other users' transmissions. The optimization is over all real vectors with positive
elements ($\mathbb{R}_{+}^{N_t}$) satisfying the power constraints. This approach facilitates the optimization presented in [30]
and [31] using the generalized inverses and multi-step optimizations.



IV. Simulation Setup 

The propagation model between each base station's transmit antenna and mobile user's receive antenna includes three factors:
a path loss component proportional to $d_{k,b}^{-\beta}$ (where $d_{k,b}$ denotes the distance from base station $b$ to mobile
user $k$ and $\beta = 3.8$ is the path loss exponent), and two random components representing lognormal shadow fading and
Rayleigh fading. The channel gain between transmit antenna $t$ of base station $b$ and receive antenna $r$ of the $k$th user is
given by

where $[\mathbf{H}_{k,b}]_{(r,t)}$ is the $(r,t)$ element of the channel matrix $\mathbf{H}_{k,b} \in \mathbb{C}^{n_r \times n_t}$
from base station $b$ to mobile user $k$, the $\mathcal{CN}(0,1)$ factor represents independent Rayleigh fading, $d_0 = 1$ km is
the cell radius, and $\rho_{k,b} = 10^{\rho_{k,b}^{(\mathrm{dBm})}/10}$ is the lognormal shadow fading variable between the $b$th
base station and the $k$th user, where $\rho_{k,b}^{(\mathrm{dBm})} \sim \mathcal{N}(0, \sigma_\rho)$ and $\sigma_\rho = 8$ dB is
its standard deviation. A reference SNR of $\Gamma = 20$ dB is a typical value of the interference-free SNR at the cell boundary
(as in [38]).

Fig. 3. The cellular layout of B = 3 and B = 7 clustered network MIMO coordination. The borders of clusters are bold. Green colored cells represent the
analyzed center cluster and the grey cells are causing inter-cell interference. For B = 7, one tier of interfering clusters is considered, while for B = 3 two
tiers of interfering cells are accounted for.

Our cellular network setup involves clustering. Since global coordination is not feasible, clustering with cluster sizes of up to 
$B = 7$ is considered. The cellular network layout is shown in Fig. 3. A base station is located at the center of each hexagonal
cell. Each base station is equipped with $n_t$ transmit antennas. There are $n_r$ receive antennas on each user's receiver and
there are $K$ users per cell per subband. All $N_t = B n_t$ base station transmit antennas in each cluster are coordinated. In
Fig. 3 the clusters of sizes 3 and 7 are shown. For cluster size 7, one wrap-around layer of clusters is considered to contribute
inter-cluster interference, while for $B = 3$ two tiers of interfering cells are accounted for. User locations are generated
randomly, uniformly and independently in each cell. For each drop of users, the distance of users from base stations in the
network is computed and path loss, lognormal and Rayleigh fading are included in the channel gain calculations. User scheduling
is performed employing a greedy algorithm with maximum sum rate and proportionally-fair criteria with the updated weights for
the rate of each user as in [43]-[47]. To compare the results, all the sum rates achieved through network MIMO coordination are
normalized by the cluster size $B$. Base stations causing inter-cluster interference are assumed to transmit at full power,
which is the worst case as discussed in Section II.
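
A channel generator for such a setup might look like the sketch below. The exact channel gain expression is not reproduced here, so the way the three factors are combined is an assumption: the power gain is taken as the product of the reference SNR, the lognormal shadowing term and the normalized path loss, multiplied by unit-variance Rayleigh fading. The shadowing is drawn as a real Gaussian in dB, which is also an assumption, and the function name and defaults (other than the parameter values quoted in this section) are hypothetical.

```python
import numpy as np

def channel_matrix(d_kb, n_r, n_t, rng, beta=3.8, d0=1.0,
                   shadow_std_db=8.0, ref_snr_db=20.0):
    """Draw an n_r x n_t channel between one base station and one user.

    d_kb          distance from base station b to user k (same unit as d0, km)
    beta          path loss exponent
    shadow_std_db standard deviation of the lognormal shadowing in dB
    ref_snr_db    interference-free reference SNR at the cell boundary
    """
    gamma = 10 ** (ref_snr_db / 10)
    rho = 10 ** (rng.normal(0.0, shadow_std_db) / 10)   # one draw per link
    path_loss = (d_kb / d0) ** (-beta)
    rayleigh = (rng.standard_normal((n_r, n_t))
                + 1j * rng.standard_normal((n_r, n_t))) / np.sqrt(2)
    return np.sqrt(gamma * rho * path_loss) * rayleigh

rng = np.random.default_rng(0)
H_edge = channel_matrix(1.0, n_r=2, n_t=4, rng=rng)     # user at the cell boundary
H_near = channel_matrix(0.3, n_r=2, n_t=4, rng=rng)     # user close to the base station
print(H_edge.shape, np.mean(np.abs(H_near) ** 2))
```

With shadowing switched off and the user at distance $d_0$, the average per-entry power gain equals the reference SNR, which matches its interpretation as the interference-free SNR at the cell boundary.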

V. Numerical Results 

In this section, the performance results (obtained via Monte Carlo simulations) of the proposed optimal BD scheme in a 
network MIMO coordinated system are discussed. The network MIMO coordination exhibits several system advantages, which 
are exposed in the following. 

A. Network MIMO Gains 

While universal network MIMO coordination is practically impossible, clustering is a practical scheme, which also
retains network MIMO coordination gains and reduces the amount of feedback required at the base stations [26], [38].
The size of clusters, $B$, is a parameter in network MIMO coordination. $B = 1$ means no coordination with the optimal BD scheme
applied. Fig. 4 shows that the throughput of the system increases with increasing cluster size. System throughput is computed
using MSR scheduling and averaged over several channel realizations for a large number of user locations generated randomly.
The normalized throughput for different cluster sizes is compared, which means that the total throughput in each cluster is 
divided by the number of cells in each cluster B. The normalized sum rate has lower variance in larger clusters, which shows 
that the performance of the system is less dependent on the position of users and that network MIMO coordination brings 
more stability to the system. 

B. Multiple-Antenna Gains 

The inter-cell interference mitigation through coordination of base stations enables the cellular network to enjoy the great
spectral efficiency improvement associated with employing multiple antennas. Fig. 5 shows the linear growth of the maximum
throughput achievable through the proposed optimal multi-cell BD and the capacity limits of DPC [22]. The number of receive
antennas at each mobile user is fixed to $n_r = 2$ and the number of transmit antennas $n_t$ at each base station is increasing.
When the cluster size grows, the slope of spectral efficiency also increases. The maximum power on each transmit antenna is
normalized such that the total power at each base station is constant for different $n_t$.

Fig. 4. CDF of sum rate with different cluster sizes B = 1, 3, 7, n t = 4, n r = 2 and 10 users per cell.

Fig. 5. Sum-rate increase with the number of antennas per base station. n r = 2.

Fig. 6. Sum rate per cell achieved with the proposed optimal BD and the capacity limits of DPC for cluster sizes B = 1, 3, 7; nt = 4, n r = 2.

Fig. 7. CDF of the mean rates in the clusters of sizes B = 3, 7 and comparison with B = 1 (no coordination) using the proposed optimal BD.

Fig. 8. Convergence of the gradient descent method for the proposed optimal BD for B = 3, nt = 4, n r = 2, and 8 users per cell.

C. Multiuser Diversity 

Multi-cell coordination benefits from increased multiuser diversity, since the number of users scheduled at each time interval 
is B times that without coordination. Fig. 6 shows the multiuser diversity gain of network MIMO with up to 10 users 
per cell. MSR scheduling is applied for each drop of users and the results are averaged over several channel realizations. 

D. Fairness Advantages 

One of the main purposes of network MIMO coordination is to let cell-edge users benefit from the signals of neighboring base 
stations. In Fig. 7 the cumulative distribution functions (CDFs) of mean rates for users are shown and compared for B = 1 (i.e. 
beamforming without coordination) and B = 3, 7 for the proposed optimal BD scheme. There are 10 users per cell randomly 
and uniformly dropped in the network for each simulation. For each drop of users, the proportionally fair scheduling algorithm 
is applied over hundreds of scheduling time intervals using sliding window width r = 10 time slots (see ifTTI ). Each user's 
rates achieved in all time intervals are averaged to find the mean rate per user, and the CDF of these mean rates over several 
user locations is plotted. As shown by the plots, with network MIMO coordination nearly 70% and 80% of users have a mean 
rate larger than 1 bps/Hz for B = 3 and B = 7, respectively, compared with 45% of users for the scheme without coordination. 
However, fairness among users does not seem to improve when the cluster size increases. This is perhaps due to the larger 
number of cell-edge users when the cluster size increases. 
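
The proportionally fair rule used above can be illustrated with a short Python sketch. It ranks candidate user sets by the sum of instantaneous rate divided by running mean rate, then updates the running means over a sliding window. The function names, the dictionary layout, and the exponentially weighted form of the window are assumptions made for illustration only; the scheduler actually used in the simulations is the one described in the cited reference.

```python
import numpy as np

def pf_metric(set_rates, avg_rate):
    # Proportionally fair weight of a candidate user set:
    # sum over scheduled users of (instantaneous rate / running mean rate).
    return sum(rate / avg_rate[k] for k, rate in set_rates.items())

def pf_update(avg_rate, served_rates, window=10):
    # Exponentially weighted running mean over roughly `window` slots
    # (assumed form); users that were not served contribute a rate of zero.
    new_avg = (1.0 - 1.0 / window) * avg_rate
    for k, rate in served_rates.items():
        new_avg[k] += rate / window
    return new_avg

# Hypothetical example: three users, two candidate sets with made-up rates.
avg = np.ones(3)
candidates = [{0: 1.0, 1: 0.5}, {1: 2.0, 2: 0.2}]
best = max(candidates, key=lambda c: pf_metric(c, avg))
avg = pf_update(avg, best)
```

Users that are rarely served see their running mean shrink, which raises their weight in later slots; this is the mechanism that trades some sum rate for fairness.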



E. Convergence 

Convergence of the gradient descent method proposed in Section III-B is illustrated in Fig. 8. The sum rate obtained after 
each iteration, normalized by the optimal target value, is plotted versus the number of iterations. The 
convergence behavior of the algorithm is shown for 20 independent, randomly generated sets of user locations and their 
channel realizations. Nearly all of the optimizations converge to within 1% of the target value in the first 10 iterations. 

VI. Conclusions 

In this paper, multi-cell coordinated downlink MIMO transmission has been considered under per-antenna power constraints. 
The sub-optimality of the conventional BD considered in earlier research has been shown, which motivated the search for the 
optimal BD scheme. The optimal block diagonalization (BD) scheme for a network MIMO coordinated system under per-antenna 
power constraints has been proposed, and it has been shown that it can be generalized to the case of per-base-station 
power constraints. The simple iterative gradient descent algorithm employed in this paper gives the optimal precoders for 
multi-cell BD. Comprehensive simulation results have demonstrated the advantages achieved by using multi-cell coordinated 
transmission under the more practical per-antenna power constraints. 

Appendix 
Proof of Theorem 1 

We consider the optimization problem (19). For ease of further analysis, let us substitute $S_k = H_k V_k \Phi_k V_k^H H_k^H$ and 
$G_k = H_k V_k$, where $\mathrm{rank}(G_k) \le n_r$. Note that the rank constraint on $\Phi_k$ must be inserted into the optimization when 
$m_r > n_r$, and hence it makes the problem non-convex. Thus, to analyze this problem two cases are considered based on the 
value of $m_r$ with respect to $n_r$. In the first case $m_r = n_r$, which occurs when the total number of transmit antennas at all 
base stations, $N_t$, is equal to the total number of receive antennas at all K served users, $N_r$. In the second case $N_t > N_r$. 



A. $N_t = N_r$ 

This happens when exactly $K = N_t / n_r$ users are scheduled. In this case, the rank constraint over $\Phi_k$ can be dropped because 
$m_r = n_r$, and therefore the optimization problem in (19) is convex. The matrices $G_k$ are also square and invertible; therefore 
$G_k^{\dagger} = G_k^{-1}$. Let $Q_k = V_k G_k^{-1}$, which is an $N_t \times n_r$ matrix. Thus, the throughput maximization problem can be expressed as 
(since $S_k \succeq 0 \iff \Phi_k \succeq 0$) 



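$$
\begin{aligned}
\text{maximize} \quad & \sum_{k=1}^{K} \log\left|I + S_k\right| \\
\text{subject to} \quad & \Big[\sum_{k=1}^{K} Q_k S_k Q_k^H\Big]_{i,i} \le P_i, \quad i = 1, \ldots, N_t \\
& S_k \succeq 0, \quad k = 1, \ldots, K
\end{aligned}
\tag{32}
$$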



where $S_k \in \mathbb{C}^{n_r \times n_r}$. Although one possibility is to perform this convex optimization with $K n_r (n_r - 1)/2$ variables, introducing 
logarithmic barrier functions for the inequality power constraints and the set of positive semi-definite constraints, we approach the 
problem by establishing the dual problem and solving it through a simple and efficient gradient descent method [23]. Hence, the 
Lagrangian function can be formed as 



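$$
\mathcal{L}(\{S_k\}; \Lambda) = \sum_{k=1}^{K} \log\left|I + S_k\right| + \sum_{k=1}^{K} \mathrm{tr}\{\Omega_k S_k\} - \mathrm{tr}\Big\{\Lambda\Big(\sum_{k=1}^{K} Q_k S_k Q_k^H - P\Big)\Big\}
\tag{33}
$$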



where $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_{N_t})$ is a dual variable, a diagonal matrix with non-negative elements $\lambda_i \ge 0$. The positive 
semi-definite matrix $\Omega_k$ is a dual variable that ensures positive semi-definiteness of $S_k$. The Karush-Kuhn-Tucker (KKT) conditions 
require that the optimal values of the primal and dual variables [23] satisfy the following: 

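$$
\begin{aligned}
& S_k = \left(Q_k^H \Lambda Q_k - \Omega_k\right)^{-1} - I, \quad k = 1, \ldots, K \\
& S_k \succeq 0 \\
& \mathrm{tr}\{\Omega_k S_k\} = 0, \quad \Omega_k \succeq 0 \\
& \mathrm{tr}\Big\{\Lambda\Big(\sum_{k=1}^{K} Q_k S_k Q_k^H - P\Big)\Big\} = 0, \quad \Lambda \succeq 0 \\
& P \succeq \mathrm{diag}\Big[\sum_{k=1}^{K} Q_k S_k Q_k^H\Big]
\end{aligned}
\tag{34}
$$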
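
For readers who want the intermediate step, the first KKT condition is the stationarity condition of the Lagrangian (33) with respect to $S_k$. The short derivation below is a sketch that uses the standard matrix-derivative identities for $\log|\cdot|$ and $\mathrm{tr}\{\cdot\}$; it assumes $I + S_k$ and $Q_k^H \Lambda Q_k - \Omega_k$ are invertible.

```latex
\begin{aligned}
\frac{\partial \mathcal{L}}{\partial S_k}
  &= (I + S_k)^{-1} + \Omega_k - Q_k^H \Lambda Q_k = 0 \\
\Longrightarrow\quad
(I + S_k)^{-1} &= Q_k^H \Lambda Q_k - \Omega_k \\
\Longrightarrow\quad
S_k &= \left(Q_k^H \Lambda Q_k - \Omega_k\right)^{-1} - I .
\end{aligned}
```

The third term uses $\mathrm{tr}\{\Lambda Q_k S_k Q_k^H\} = \mathrm{tr}\{Q_k^H \Lambda Q_k S_k\}$, so its derivative with respect to $S_k$ is $Q_k^H \Lambda Q_k$.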



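Let the SVD of $Q_k^H \Lambda Q_k$ be $U_k \Sigma_k U_k^H$. Since $Q_k^H \Lambda Q_k \succeq 0$, the diagonal entries of $\Sigma_k$ are the eigenvalues of $Q_k^H \Lambda Q_k$. The 
first KKT condition on $S_k$ and $\Omega_k$ requires that 

$$
S_k = U_k \left[\Sigma_k^{-1} - I\right]_+ U_k^H, \qquad
\Omega_k = U_k \left[\Sigma_k - I\right]_+ U_k^H
\tag{35}
$$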
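
A small numerical sketch can make the closed form (35) concrete. The NumPy function below computes $S_k$ and $\Omega_k$ for a single user from a given $Q_k$ and the diagonal of $\Lambda$. The function name `primal_from_dual`, the use of `numpy.linalg.eigh` in place of an SVD (equivalent for a Hermitian positive definite matrix), and the random test data are assumptions for illustration, and all eigenvalues are assumed strictly positive.

```python
import numpy as np

def primal_from_dual(Q, lam):
    """Return (S, Omega) for one user, following the closed form in (35).

    Q   : (N_t, n_r) complex matrix, playing the role of Q_k
    lam : (N_t,) positive vector, the diagonal of Lambda
    """
    M = Q.conj().T @ (lam[:, None] * Q)      # Q^H Lambda Q
    M = (M + M.conj().T) / 2                 # remove round-off asymmetry
    sigma, U = np.linalg.eigh(M)             # eigenvalues assumed > 0
    s = np.maximum(0.0, 1.0 / sigma - 1.0)   # [Sigma^{-1} - I]_+
    w = np.maximum(0.0, sigma - 1.0)         # [Sigma - I]_+
    S = (U * s) @ U.conj().T
    Omega = (U * w) @ U.conj().T
    return S, Omega

# Hypothetical data: N_t = 6 transmit antennas, n_r = 2 receive antennas.
rng = np.random.default_rng(0)
Q = rng.standard_normal((6, 2)) + 1j * rng.standard_normal((6, 2))
lam = rng.uniform(0.01, 0.2, size=6)
S, Omega = primal_from_dual(Q, lam)
```

Each eigen-direction is either active (eigenvalue below one, so $S_k$ allocates power and $\Omega_k$ vanishes there) or inactive (eigenvalue above one, so $S_k$ vanishes and $\Omega_k$ absorbs the excess), which is why complementary slackness $\mathrm{tr}\{\Omega_k S_k\} = 0$ holds by construction.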



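where the operator $[D]_+ = \mathrm{diag}[\max(0, d_1), \ldots, \max(0, d_n)]$ acts on a diagonal matrix $D = \mathrm{diag}[d_1, \ldots, d_n]$. Replacing these in the 
KKT condition corresponding to the power constraints gives 

$$
\mathrm{tr}\Big\{\Lambda\Big(\sum_{k=1}^{K} Q_k S_k Q_k^H - P\Big)\Big\} = K n_r - \sum_{k=1}^{K} \mathrm{tr}\{Q_k^H \Lambda Q_k - \Omega_k\} - \mathrm{tr}\{\Lambda P\}
\tag{36}
$$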



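Now, we establish the Lagrange dual function as 

$$
g(\Lambda) = \sup_{\{S_k\}} \mathcal{L}(\{S_k\}; \Lambda)
= -\sum_{k=1}^{K} \log\left|Q_k^H \Lambda Q_k - \Omega_k\right| + \sum_{k=1}^{K} \mathrm{tr}\{Q_k^H \Lambda Q_k - \Omega_k\} - K n_r + \mathrm{tr}\{\Lambda P\}.
\tag{37}
$$

Since the constraint functions are affine, strong duality holds and thus the dual objective reaches a minimum at the optimal 
value of the primal problem [23]. As a result, the Lagrange dual problem can be stated as 

$$
\begin{aligned}
\text{minimize} \quad & g(\Lambda) \\
\text{subject to} \quad & \Lambda \succeq 0, \quad \Lambda \ \text{diagonal}
\end{aligned}
\tag{38}
$$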
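
The step from the Lagrangian (33) to the dual function (37) can be checked term by term. The sketch below writes $M_k = Q_k^H \Lambda Q_k - \Omega_k$ as shorthand (a symbol introduced here only for brevity) and substitutes the stationary point $S_k = M_k^{-1} - I$ from (34).

```latex
\begin{aligned}
\log\left|I + S_k\right| &= \log\left|M_k^{-1}\right| = -\log\left|M_k\right|, \\
\mathrm{tr}\{\Omega_k S_k\} - \mathrm{tr}\{\Lambda Q_k S_k Q_k^H\}
  &= -\mathrm{tr}\{M_k S_k\}
   = -\mathrm{tr}\{I - M_k\}
   = \mathrm{tr}\{M_k\} - n_r, \\
g(\Lambda) &= \sum_{k=1}^{K} \Big( -\log\left|M_k\right| + \mathrm{tr}\{M_k\} \Big) - K n_r + \mathrm{tr}\{\Lambda P\}.
\end{aligned}
```

The middle line uses $M_k S_k = M_k (M_k^{-1} - I) = I - M_k$ and the fact that $S_k$ and $M_k$ are $n_r \times n_r$.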

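The gradient of g can be obtained from (37) as 

$$
\nabla_{\Lambda}\, g = P - \sum_{k=1}^{K} \mathrm{diag}\Big[Q_k \left(Q_k^H \Lambda Q_k - \Omega_k\right)^{-1} Q_k^H\Big] + \sum_{k=1}^{K} \mathrm{diag}\Big[Q_k Q_k^H\Big]
\tag{39}
$$

This gives a descent search direction, $\Delta\Lambda = -\nabla_{\Lambda}\, g$, for the gradient algorithm for the Lagrange dual problem (38). 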
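
The dual descent can be sketched numerically as follows. The code evaluates $g(\Lambda)$ from (37) and its gradient from (39), using the identity $(Q_k^H \Lambda Q_k - \Omega_k)^{-1} - I = S_k$, and takes projected gradient steps with a backtracking line search. The function names, the positivity floor, the starting point, and the line-search constants are assumptions of this sketch, not details taken from the paper, whose step-size rule and stopping criterion are given in the main text.

```python
import numpy as np

def _eig(Q, lam):
    # Eigen-decomposition of the Hermitian matrix Q^H Lambda Q.
    M = Q.conj().T @ (lam[:, None] * Q)
    return np.linalg.eigh((M + M.conj().T) / 2)

def dual_value(Qs, lam, P):
    # g(Lambda) as in (37); Q^H Lambda Q - Omega has eigenvalues min(sigma, 1).
    val = float(lam @ P)
    for Q in Qs:
        sigma, _ = _eig(Q, lam)
        m = np.minimum(sigma, 1.0)
        val += float(np.sum(m - np.log(m) - 1.0))
    return val

def dual_gradient(Qs, lam, P):
    # Diagonal of (39), simplified to P - sum_k diag(Q_k S_k Q_k^H).
    grad = P.astype(float).copy()
    for Q in Qs:
        sigma, U = _eig(Q, lam)
        s = np.maximum(0.0, 1.0 / sigma - 1.0)   # eigenvalues of S_k
        grad -= (np.abs(Q @ U) ** 2) @ s
    return grad

def dual_descent(Qs, P, iters=100, floor=1e-8):
    # Projected gradient descent with Armijo backtracking (assumed variant).
    lam = np.ones_like(P, dtype=float)
    for _ in range(iters):
        g0, grad = dual_value(Qs, lam, P), dual_gradient(Qs, lam, P)
        t = 1.0
        while t > 1e-12:
            new = np.maximum(lam - t * grad, floor)   # keep Lambda positive
            if dual_value(Qs, new, P) <= g0 - 1e-4 * grad @ (lam - new):
                lam = new
                break
            t *= 0.5
        else:
            break   # no acceptable step found: treat as converged
    return lam

# Hypothetical instance: N_t = 4, K = 2 users with n_r = 2, unit power limits.
rng = np.random.default_rng(1)
Qs = [rng.standard_normal((4, 2)) + 1j * rng.standard_normal((4, 2)) for _ in range(2)]
P = np.full(4, 1.0)
lam_opt = dual_descent(Qs, P)
```

Because only the diagonal of $\Lambda$ is free, the dual has just $N_t$ real variables regardless of the number of users, which is what makes this approach attractive compared with optimizing the $S_k$ directly.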



B. $N_t > N_r$ 

When the total number of transmit antennas is strictly larger than the total number of receive antennas in the network (i.e. 
$N_t > N_r$), the optimization problem in (32) is no longer convex due to the rank constraints. We relax the problem and show 
that the relaxation leads to an optimal solution that also satisfies the rank constraints in the original problem. A gradient 
algorithm similar to the one for $N_t = N_r$ can be deployed to find the optimal BD precoders. 

Recall that $m_r = N_t - (K - 1) n_r$. Thus, when the total number of transmit antennas is strictly larger than the total number 
of receive antennas, $N_t > N_r$, then $m_r > n_r$. From Section III, note that $V_k$ is an $N_t \times m_r$ matrix and correspondingly the size 
of $\Psi_k$ is $m_r \times n_r$, which enforces a rank constraint over $\Phi_k = \Psi_k \Psi_k^H$ (i.e. $\mathrm{rank}(\Phi_k) \le n_r$). This updates the optimization 
in (19) by adding the rank constraints as 



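$$
\begin{aligned}
\text{maximize} \quad & \sum_{k=1}^{K} \log\left|I + H_k V_k \Phi_k V_k^H H_k^H\right| \\
\text{subject to} \quad & \Big[\sum_{k=1}^{K} V_k \Phi_k V_k^H\Big]_{i,i} \le P_i, \quad i = 1, \ldots, N_t \\
& \Phi_k \succeq 0, \quad \mathrm{rank}(\Phi_k) \le n_r, \quad k = 1, \ldots, K.
\end{aligned}
\tag{40}
$$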
The problem above is not convex due to the rank constraint. Consider the convex relaxation obtained by removing the 
rank constraint. The problem can then be expressed as 

$$
\begin{aligned}
\text{maximize} \quad & \sum_{k=1}^{K} \log\left|I + H_k V_k \Phi_k V_k^H H_k^H\right| \\
\text{subject to} \quad & \Big[\sum_{k=1}^{K} V_k \Phi_k V_k^H\Big]_{i,i} \le P_i, \quad i = 1, \ldots, N_t \\
& \Phi_k \succeq 0, \quad k = 1, \ldots, K
\end{aligned}
\tag{41}
$$

Since this problem is convex and the constraints are affine, any solution satisfying the KKT conditions is optimal. Let us 
introduce an optimization problem 

$$
\begin{aligned}
\text{maximize} \quad & \sum_{k=1}^{K} \log\left|I + S_k\right| \\
\text{subject to} \quad & \Big[\sum_{k=1}^{K} V_k G_k^{\dagger} S_k \big(G_k^{\dagger}\big)^H V_k^H\Big]_{i,i} \le P_i, \quad i = 1, \ldots, N_t \\
& S_k \succeq 0, \quad k = 1, \ldots, K.
\end{aligned}
\tag{42}
$$

Assume the optimal solutions for this problem are $S_k^*$. Then $\Phi_k^* = G_k^{\dagger} S_k^* \big(G_k^{\dagger}\big)^H$ satisfies all the KKT conditions for 
(41), since $G_k \Phi_k^* G_k^H = S_k^*$. Furthermore, $\mathrm{rank}(\Phi_k^*) = \mathrm{rank}(S_k^*) \le n_r$, which also satisfies the rank constraint in the original 
optimization problem (40). Note also that $\Phi_k^* \succeq 0 \iff S_k^* \succeq 0$ (see gU p. 399]). 
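
The properties used in this argument are easy to check numerically. The sketch below builds a random wide matrix standing in for $G_k$ (size $n_r \times m_r$ with $m_r > n_r$), lifts an arbitrary positive semi-definite $S$ to $\Phi = G^{\dagger} S (G^{\dagger})^H$ with `numpy.linalg.pinv`, and leaves the rank, the identity $G \Phi G^H = S$, and positive semi-definiteness to be inspected. The function name `lift_covariance` and the random test matrices are assumptions for illustration; $G$ is assumed to have full row rank.

```python
import numpy as np

def lift_covariance(G, S):
    """Map an n_r x n_r matrix S to Phi = G^+ S (G^+)^H of size m_r x m_r."""
    G_pinv = np.linalg.pinv(G)            # right inverse when G has full row rank
    return G_pinv @ S @ G_pinv.conj().T

# Hypothetical sizes for the N_t > N_r case: m_r > n_r.
rng = np.random.default_rng(0)
n_r, m_r = 2, 4
G = rng.standard_normal((n_r, m_r)) + 1j * rng.standard_normal((n_r, m_r))
A = rng.standard_normal((n_r, n_r)) + 1j * rng.standard_normal((n_r, n_r))
S = A @ A.conj().T                        # arbitrary PSD stand-in for S_k^*

Phi = lift_covariance(G, S)
rank_Phi = np.linalg.matrix_rank(Phi)     # should not exceed n_r
recovered = G @ Phi @ G.conj().T          # should reproduce S
```

Since $G G^{\dagger} = I$ for a full-row-rank $G$, the lifted matrix keeps the rank of $S$ even though it lives in the larger $m_r \times m_r$ space, which is why the relaxed solution meets the rank constraint in (40).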

The optimization in (42) is equivalent to the convex optimization problem in (32) by replacing $Q_k = V_k G_k^{\dagger}$. Recall that 
when $m_r = n_r$ the matrix $G_k$ is square and invertible. Hence, $Q_k = V_k G_k^{-1}$, as defined in Section A. Thus, this problem 
can be solved through the gradient descent method applied to the dual problem (38) with the gradient descent search direction 
(39). The stopping criterion is also the same as (25), except that $Q_k$ has a different definition. 

Note that (24) follows directly from the first equation of the KKT conditions (34) and the definition 
$\Phi_k^* = G_k^{\dagger} S_k^* \big(G_k^{\dagger}\big)^H$ for the optimal value of the dual variables $\Lambda^*$. 